Efficient Search for Approximate Nearest Neighbor in High Dimensional Spaces
Abstract
We address the problem of designing data structures that allow efficient search for approximate nearest neighbors. More specifically, given a database consisting of a set of vectors in some high dimensional Euclidean space, we want to construct a space-efficient data structure that would allow us to search, given a query vector, for the closest or nearly closest vector in the database. We also address this problem when distances are measured by the L1 norm, and in the Hamming cube. Significantly improving and extending recent results of Kleinberg, we construct data structures whose size is polynomial in the size of the database, and search algorithms that run in time nearly linear or nearly quadratic in the dimension (depending on the case; the extra factors are polylogarithmic in the size of the database).
Author affiliations:
Computer Science Department, Technion - IIT, Haifa 32000, Israel. Email: [email protected]
Bell Communications Research, MCC-1C365B, 445 South Street, Morristown, NJ 07960-6438, USA. Email: [email protected]
Computer Science Department, Technion - IIT, Haifa 32000, Israel. Part of this work was done while visiting Bell Communications Research. Work at the Technion supported by BSF grant 96-00402, by a David and Ruth Moskowitz Academic Lectureship award, and by grants from the S. and N. Grand Research Fund, from the Smoler Research Fund, and from the Fund for the Promotion of Research at the Technion. Email: [email protected]
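To make the search problem in the abstract concrete, the sketch below shows the brute-force baseline that such data structures aim to beat, under the three distance measures the abstract mentions (Euclidean, L1, Hamming). It is not the paper's data structure or algorithm. The function names and the (1 + eps) acceptance test are assumptions introduced here for illustration; the abstract only says "closest or nearly closest" and does not give a formal definition.

```python
from typing import Callable, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def l2(u: Vector, v: Vector) -> float:
    """Euclidean (L2) distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def l1(u: Vector, v: Vector) -> float:
    """L1 (Manhattan) distance."""
    return sum(abs(a - b) for a, b in zip(u, v))


def hamming(u: Vector, v: Vector) -> int:
    """Hamming distance: number of coordinates that differ."""
    return sum(a != b for a, b in zip(u, v))


def nearest_neighbor(database: Sequence[Vector], query: Vector,
                     dist: Distance = l2) -> int:
    """Exact nearest neighbor by linear scan.

    Costs one distance evaluation per database vector, i.e. time
    proportional to (database size) x (dimension) per query.
    Returns the index of the closest vector.
    """
    return min(range(len(database)), key=lambda i: dist(database[i], query))


def is_approx_nearest(database: Sequence[Vector], query: Vector,
                      candidate_index: int, eps: float,
                      dist: Distance = l2) -> bool:
    """Assumed formalization of "nearly closest": the candidate is
    accepted if its distance to the query is at most (1 + eps) times
    the distance of the true nearest neighbor."""
    best = min(dist(p, query) for p in database)
    return dist(database[candidate_index], query) <= (1 + eps) * best


if __name__ == "__main__":
    db = [(0, 0), (3, 4), (1, 1)]
    q = (1, 2)
    print(nearest_neighbor(db, q))                # index of the closest vector
    print(is_approx_nearest(db, q, 0, eps=1.5))   # is db[0] close enough?
```

The linear scan answers every query correctly but touches the whole database each time; the abstract's contribution is a polynomial-size structure whose query time depends only polylogarithmically on the database size.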
Similar resources
What Is the Nearest Neighbor in High Dimensional Spaces?
Nearest neighbor search in high dimensional spaces is an interesting and important problem which is relevant for a wide variety of novel database applications. As recent results show, however, the problem is a very difficult one, not only with regard to the performance issue but also to the quality issue. In this paper, we discuss the quality issue and identify a new generalized notion of neares...
Nearest Neighbor Search in Multidimensional Spaces Depth Oral Report
The Nearest Neighbor Search problem is defined as follows: given a set P of n points, preprocess the points so as to efficiently answer queries that require finding the closest point in P to a query point q. If we are willing to settle for a point that is almost as close as the nearest neighbor, then we can relax the problem to the approximate Nearest Neighbor Search. Nearest Neighbor Search (exact ...
Using the Distance Distribution for Approximate Similarity Queries in High-Dimensional Metric Spaces
We investigate the problem of approximate similarity (nearest neighbor) search in high-dimensional metric spaces, and describe how the distance distribution of the query object can be exploited so as to provide probabilistic guarantees on the quality of the result. This leads to a new paradigm for similarity search, called PAC-NN (probably approximately correct nearest neighbor) queries, aiming...
Metric-Based Shape Retrieval in Large Databases
This paper examines the problem of database organization and retrieval based on computing metric pairwise distances. A low-dimensional Euclidean approximation of a high-dimensional metric space is not efficient, while search in a high-dimensional Euclidean space suffers from the “curse of dimensionality”. Thus, techniques designed for searching metric spaces must be used. We evaluate several su...
SIMP: Accurate and Efficient Near Neighbor Search in Very High Dimensional Spaces
Near neighbor search in very high dimensional spaces is useful in many applications. Existing techniques solve this problem efficiently only for the approximate case. These solutions are designed to solve r-near neighbor queries only for a fixed query range or a set of query ranges with probabilistic guarantees and then, extended for nearest neighbor queries. Solutions supporting a set of query...
Publication date: 1998